Computing Paraphrasability of Syntactic Variants Using Web Snippets

Authors

Atsushi Fujita

Satoshi Sato

Abstract

A large-scale knowledge base of paraphrases is expected to improve performance across a broad range of natural language processing tasks. The key issue in creating such a resource is to establish a practical method for computing both semantic equivalence and syntactic substitutability, i.e., paraphrasability, between a given pair of expressions. This paper addresses the problem of computing paraphrasability, focusing on syntactic variants of predicate phrases. Our model estimates paraphrasability using traditional distributional similarity measures, with Web snippets used to overcome the data sparseness problem that arises when handling predicate phrases. Several feature sets are evaluated through empirical experiments.
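The general idea of scoring a phrase pair by distributional similarity over snippet contexts can be sketched as follows. This is only a minimal illustration, not the authors' model: the abstract does not say which similarity measures or feature sets are used, so the bag-of-words context window, the cosine measure, the function names `context_features` and `cosine`, and the hard-coded example snippets are all assumptions made for this sketch. In practice the snippets would come from a Web search engine rather than from literals.

```python
import math
import re
from collections import Counter


def context_features(phrase, snippets, window=3):
    """Count the words found within `window` tokens of each occurrence
    of `phrase` in the given snippets (a simple bag-of-words context)."""
    phrase_tokens = phrase.lower().split()
    n = len(phrase_tokens)
    features = Counter()
    for snippet in snippets:
        tokens = re.findall(r"\w+", snippet.lower())
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase_tokens:
                left = tokens[max(0, i - window):i]
                right = tokens[i + n:i + n + window]
                features.update(left + right)
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters).
    Returns 0.0 when either vector is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical snippets standing in for search-engine results.
snippets_a = [
    "Engineers managed to solve the problem before the release.",
    "It took weeks to solve the problem with the server.",
]
snippets_b = [
    "Engineers managed to find a solution to the problem before the release.",
    "They hope to find a solution to the problem with the server.",
]

score = cosine(
    context_features("solve the problem", snippets_a),
    context_features("find a solution to the problem", snippets_b),
)
print(f"context similarity: {score:.3f}")
```

A higher score here means only that the two phrases appear in similar surrounding words within the sample snippets. Whether that corresponds to paraphrasability depends on the feature set and measure chosen, which is what the paper evaluates empirically.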

Similar resources

Finding Distinct Answers in Web Snippets

This paper presents ListWebQA, a question answering system aimed specifically at discovering answers to list questions in web snippets. ListWebQA retrieves snippets likely to contain answers by means of a query rewriting strategy, and extracts answers according to their syntactic and semantic similarities afterwards. These similarities are determined by means of a set of surface syntactic patte...

Mining Web Snippets to Answer List Questions

This paper presents ListWebQA, a question answering system that is aimed specifically at extracting answers to list questions exclusively from web snippets. Answers are identified in web snippets by means of their semantic and syntactic similarities. Initial results show that they are a promising source of answers to list questions.

International Journal of Soft Computing and Engineering

Semantic similarity measures play an important role in the extraction of semantic relations. Semantic similarity measures are widely used in Natural Language Processing (NLP) and Information Retrieval (IR). The work proposed here uses web based metrics to compute the semantic similarity between words or terms and also compares with the state-of-the-art. For a computer to decide the semantic sim...

Using the Web as a Corpus for the Syntactic-Based Collocation Identification

This paper presents an experiment that uses a Web search engine and a robust parser for the Web-based identification of collocations (statistically significant word associations representing “a conventional way of saying things” (Manning and Schütze, 1999)). We identify the possible collocates of a given word by parsing the text snippets returned by the search engine when querying that word. Th...

Leveraging Flawed Tutorials for Seeding Large-Scale Web Vulnerability Discovery

The Web is replete with tutorial-style content on how to accomplish programming tasks. Unfortunately, even top-ranked tutorials suffer from severe security vulnerabilities, such as cross-site scripting (XSS), and SQL injection (SQLi). Assuming that these tutorials influence real-world software development, we hypothesize that code snippets from popular tutorials can be used to bootstrap vulnera...

Publication date: 2008
